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Abstract 

We  consider  optimal  dynamic  multidrug  therapies  for  human  immunodeficiency 
virus  (HIV)  type  1  infection.  In  this  context  we  describe  an  optimal  tracking  problem 
attempting  to  drive  the  states  of  the  system  to  a  stationary  state  in  which  the  viral 
load  is  low  and  the  immune  response  is  strong.  We  consider  optimal  feedback  control 
with  full  state  as  well  as  with  partial  state  measurements.  In  the  case  of  partial  state 
measurement,  a  state  estimator  is  constructed  based  on  viral  load  and  T-cell  count 
measurements.  We  demonstrate  by  numerical  simulations  that  by  anticipation  of  and 
response  to  the  disease  progression,  the  dynamic  multidrug  strategy  reduces  the  viral 
load,  increases  the  CD4+  T-cell  count  and  improves  the  immune  response. 
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1  Introduction 


During  the  last  two  decades  medical  treatment  for  human  immunodeficiency  virus  (HIV) 
has  greatly  improved.  Typically  therapy  can  prolong  time  to  onset  of  acquired  immune  de¬ 
ficiency  syndrome  (AIDS)  for  tens  of  years.  The  prevailing  medical  practice  is  to  prescribe 
highly  active  antiretroviral  therapy  (HAART)  which  can  reduce  viral  load  and  maintain 
high  CD4+  T-cell  counts.  This  therapy  involves  combinations  of  three  or  more  drugs 
that  are  called  “cocktails”.  However,  in  spite  of  the  success  of  HAART,  some  patients 
develop  resistance  to  one  or  more  of  the  drugs  in  long  term  use.  In  these  cases,  it  is 
necessary  to  change  the  composition  of  HAART.  In  addition,  there  may  be  severe  side 
effects  from  the  medication.  Moreover,  in  developing  countries,  the  expense  of  HAART  is 
often  prohibitively  high.  Motivated  by  these  and  other  reasons,  the  search  for  alternative 
treatments  is  very  active.  In  this  paper  we  study  dynamic  multidrug  therapies  that  can 
lead  to  long-term  control  of  HIV  by  the  immune  response  system  after  discontinuation  of 
drug  treatment. 

HIV  infects  CD4+  T-cells  (a  fundamental  component  of  the  human  immune  response 
system)  and  other  target  cells,  hijacking  their  replication  mechanisms.  The  infected  cells 
then  produce  a  large  number  of  copies  of  the  virus.  Currently  the  two  most  important  cate¬ 
gories  of  anti-HIV  drugs  are  reverse  transcriptase  inhibitors  (RTIs)  and  protease  inhibitors 
(Pis).  A  typical  HAART  cocktail  consists  of  one  or  more  RTIs  and  a  PI.  The  reverse 
transcriptase  inhibitors  prevent  HIV  from  infecting  cells  by  blocking  the  integration  of  the 
viral  code  into  the  target  cells.  Protease  inhibitors  interfere  with  the  replication  of  viruses 
by  infected  cells.  Virions  may  still  be  produced,  but  they  are  generally  non  infectious;  that 
is,  they  are  not  capable  of  infecting  new  target  cells.  In  practice,  RTIs  cannot  completely 
block  the  virus  integration  of  the  DNA  in  target  cells.  Also,  some  infectious  virions  are 
produced  under  PI  medication.  Every  drug  has  a  maximum  efficacy  which  depends  on 
many  factors  such  as,  for  example,  viral  strains  present.  One  might  expect  that  the  effec¬ 
tiveness  of  HIV  therapy  could  be  improved  by  developing  dynamic  multidrug  strategies, 
where  the  combination  of  drugs  given  to  HIV  patients  changes  over  time  in  response  to  the 
individual’s  disease  progression. 

A  number  of  different  mathematical  models  based  on  systems  of  differential  equations 
have  been  developed,  see  for  example  [24] .  Some  of  these  models  used  to  design  dynamical 
drug  treatments  are  presented  in  [1,  8,  24,  25,  29,  30,  31].  In  the  long  term  pathogenesis  of 
HIV  an  immune  response  can  play  an  important  role.  However,  the  models  in  [24,  25]  do 
not  contain  immune  response  while  the  authors  in  [1,  8,  29,  30,  31]  do  consider  the  immune 
response.  Since  immune  mechanisms  responding  to  HIV  are  not  yet  very  well  understood, 
various  immune  response  models  have  been  proposed  in  the  literature.  In  this  study,  we 
employ  a  model  based  on  the  models  considered  in  [1,  2]  which  contain  an  immune  effector 
component. 

Optimal  treatment  of  HIV  infection  using  a  control  theoretic  approach  is  the  subject  of 
substantial  research  activity.  The  papers  [1,  3,  10,  13,  14,  26]  consider  only  RTI  medication 
while  the  papers  [20,  22]  consider  only  Pis.  In  [33,  34,  35]  all  effects  of  a  HAART  medication 
are  combined  and  represented  by  one  control  variable  in  the  model.  In  [2,  11,  17,  21,  28] 
dynamical  multidrug  therapies  based  on  RTIs  and  Pis  are  designed.  In  these  proposed 
therapies  the  dosage  of  both  medications  can  change  independently  of  each  other  and  can 
either  be  continuous  or  on-off  types.  Studies  of  continuously  varying  medical  therapies  have 
been  more  common,  see,  e.g.,  [1,  3,  9,  10,  11,  13,  14,  17,  18,  20,  22,  28,  30].  More  recently, 
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the  on-off  type  of  treatment,  which  is  also  known  as  a  structured  treatment  interruption 
(STI),  has  attracted  a  lot  of  attention  in  the  medical  literature  (see  for  example,  [1,  2,  4] 
and  the  references  therein) .  A  primary  argument  for  use  of  STI  therapy  instead  of  continu¬ 
ously  varying  dosage  is  that  one  might  lower  the  risk  of  HIV  mutating  to  strains  which  are 
resistant  to  the  current  medication  regimen.  Recent  results  on  structured  treatment  inter¬ 
ruption  schedules  including  optimal  treatments  are  presented  in  [1,  2,  21,  34,  35].  In  this 
paper  we  consider  optimal  feedback  treatment  of  HIV  infection  by  continuously  varying 
dosages  of  RTIs  and  Pis  in  a  nonlinear  model  including  an  immune  response. 

There  are  a  number  of  control  techniques  that  can  be  utilized  to  design  dynamical 
therapies  for  HIV.  Open  loop  control  has  been  employed  in  [1,  2,  10,  13,  14,  18,  20,  21,  26,  28] 
and  feedback  control  has  been  used  in  [3,  9,  11,  26,  33,  34,  35].  The  papers  [3,  9,  33]  consider 
the  feedback  control  based  on  partial  measurements.  An  estimator  is  employed  to  construct 
the  control  in  [3]  while  in  [33]  this  possibility  is  only  briefly  mentioned.  A  linear  quadratic 
regulator  (LQR)  control  based  on  Riccati  equations  is  studied  in  [11], 

For  linear  systems,  LQR  is  a  well-known  and  accepted  methodology  for  the  synthesis 
of  control  laws.  However,  most  mathematical  models  for  biological  systems,  including 
the  HIV  dynamics  with  immune  response  as  studied  in  this  paper,  are  nonlinear.  One 
of  the  promising  and  emerging  methodologies  for  designing  nonlinear  controllers  is  the 
state  dependent  Riccati  equation  (SDRE)  approach  in  the  context  of  nonlinear  regulator 
problems  (see  for  example,  [6,  7,  16,  23]).  In  essence,  the  SDRE  method  is  a  systematic  way 
of  designing  nonlinear  feedback  controllers  by  factoring  the  state  dependent  nonlinearity  of 
the  state  equations  as  a  product  of  a  state  dependent  matrix  with  the  state  vector.  That 
is,  by  using  direct  parameterization  the  nonlinear  system  is  brought  to  a  linear  structure 
with  state  dependent  coefficient  matrices.  This  parametrization  is  however  not  unique 
and  thus  some  flexibility  in  design  is  permissible.  The  state  feedback  control  law  is  then 
given  in  terms  of  the  solution  of  a  state  dependent  Riccati  equation.  As  shown  in  [7] ,  the 
SDRE  method  is  a  powerful  approach  that  is  readily  applicable  to  the  nonlinear  tracking 
and  nonlinear  state  estimation  problems,  since  it  is  closely  related  to  the  algebraic  Riccati 
equation-based  method  used  to  find  the  feedback  controls  in  the  linear  cases. 

While  the  SDRE  method  has  been  applied  earlier  to  mostly  engineering  type  problems 
such  as  flight  dynamics  simulation  [7]  and  chemical  vapor  deposition  [5] ,  the  idea  of  using 
SDRE  for  combined  drug/immune  response  control  of  HIV  infection  as  presented  here  is 
new.  In  addition,  we  propose,  in  this  paper,  a  more  systematic  approach  to  parametrization 
of  the  nonlinear  system  as  a  linear  structure  with  state  dependent  coefficient  matrices.  In 
our  approach,  the  state  dependent  matrix  is  the  Jacobian  of  the  nonlinear  system  dynamics. 
Our  parametrization  choice  together  with  the  proposed  time  discretization  method  imply 
that  the  state  dependent  coefficient  matrix  is  in  fact  the  exact  local  linearization  of  the 
nonlinear  state  dependent  system  dynamics  at  the  current  state  of  the  system. 

The  outline  of  the  paper  is  as  follows.  We  begin  in  Section  2  with  a  description  of 
a  rather  complex  HIV  model.  An  optimal  quadratic  tracking  problem  is  formulated  in 
Section  3  to  drive  the  state  of  the  system  to  a  stationary  state  which  we  call  the  “healthy” 
state  (it  has  low  viral  load  and  high  CD4+  T-cell  count).  Section  4  contains  a  direct 
factorization  of  the  nonlinear  system  into  a  linear  form  with  state  dependent  coefficient 
matrices.  By  then  mimicking  standard  LQR  formulation  for  linear  systems,  a  suboptimal 
feedback  control  is  derived  in  Section  5.  Sections  6  and  7  present  the  formulation  of  the 
state  estimator  for  more  realistic  control  problems  where  only  partial  state  measurements 
are  available  for  feedback.  In  Section  8  and  9  we  summarize  the  numerical  procedure 
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for  the  synthesis  of  the  suboptimal  feedback  controls.  Numerical  simulation  results  are 
summarized  in  Section  10  and  concluding  remarks  are  presented  in  Section  11. 


2  HIV  Model 

A  modification  of  the  mathematical  model  as  presented  in  [1,  2]  for  the  pathogenesis  of  HIV 
is  adopted  for  this  paper.  We  include  an  additional  state  variable  for  non  infectious  virus 
[12,  24].  The  model  is  essentially  based  on  the  two  target  model  proposed  by  Callaway  and 
Perelson  in  [12]  without  making  distinction  between  short  and  long  lived  infected  target 
cells.  In  addition,  it  contains  an  immune  response  component  similar  to  the  one  described 
by  Bonhoeffer,  et  al.,  in  [8]  with  a  Michaelis-Menten  type  saturation  nonlinearity.  Our 
model  captures  many  of  the  observed  behavioral  properties  of  HIV  dynamics  [8,  12]  and, 
furthermore,  its  mathematical  properties  are  well-suited  for  designing  multidrug  therapies 
using  control  theoretic  approaches  [1,  2],  The  purpose  of  adding  non  infectious  virus  to  the 
model  is  to  reflect  what  is  actually  being  measured  in  clinical  data.  The  viral  load  mea¬ 
surements  cannot  differentiate  between  infectious  and  non  infectious  virus,  detecting  only 
the  total  amount  of  virus.  Hence,  in  order  to  construct  a  state  estimator  (for  developing 
dynamic  multidrug  therapy  based  on  the  disease  progression)  using  such  measurements, 
we  need  to  model  also  the  non  infectious  virus  population.  It  should  be  noted  that  the 
inclusion  of  this  additional  state  does  not  effect  the  dynamics  of  the  other  state  variables. 

The  dynamics  of  our  HIV  model  are  described  by  the  set  of  ordinary  differential  equa¬ 
tions: 


T\ 

T2 

t; 

T* 

Vi 

Vni 

E 


Ai  —  d\Ti  —  (1  —  ei)k\VjTi 

A2  -  d2T2  -  (1  -  fe i)k2VTT2 
(1  -  t\)kxVjT\  -  ST?  -  rri\ET? 

(1  -  fe i)fc2V/T2  -  6T*  -  m2ET* 

(1  -  e2 )Nt6(T?  +  T2*)  -  [c  +  (1  -  e\)p\k\T\  +  (1  -  fe1)p2k2T2\  Vj 
e2NT5(T?  +  T2*)  -  cVNI 


A  e  +  bE 


t:  +  t; 


t*  +  r2*  +  Kb 


E  —  dE 


T*  _j_  T* 


T*  +  T*  +  Kd 


E  -  5eE. 


(1) 


In  the  model  (1),  the  state  variables  are:  Ti,  the  uninfected  CD4+  T-cells;  T2,  the  unin¬ 
fected  target  cells  of  second  kind;  Tf,  the  infected  T-cells;  Tf,  the  infected  target  cells  of 
second  kind;  Vj,  the  infectious  virus;  Vni,  the  non  infectious  virus;  and  E,  the  immune 
effectors.  The  controllers  ei  and  e2  represent  the  RTI  and  PI  “efficacies”,  respectively. 
We  do  not  give  precise  biological  definitions  for  the  target  cells  of  second  kind  and  the 
immune  effectors.  They  could,  for  example,  be  related  to  macrophages  and  cytotoxic  T- 
lymphocytes,  respectively.  For  a  more  detailed  description  of  the  variables  and  rationale 
for  the  model  (1)  we  refer  the  reader  to  the  articles  [1,  2],  Table  1  contains  the  values  of 
parameters,  which  are  the  same  as  those  used  in  [1,  2],  The  only  difference  is  that  we  use 
mm3  (cubic  millimeter)  as  our  unit  volume  instead  of  ml  (milliliter). 

In  order  to  simplify  our  subsequent  discussions,  we  introduce  the  notation  x  and  u  to 
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parameter 

value 

unit 

parameter 

value 

unit 

Ai 

10.0 

cells 

A2 

31.98  x  10”3 

cells 

mm3 -day 

mm3 -day 

d\ 

0.01 

1 

day 

d2 

0.01 

1 

day 

h 

8.0  x  10~4 

mm3 

k2 

0.1 

mm3 

virions- day 

virions -day 

m\ 

0.01 

mm3 

m2 

0.01 

mm3 

cells- day 

cells- day 

Pi 

1 

virions 

P2 

1 

virions 

cells 

cells 

5 

0.7 

1 

day 

c 

13.0 

1 

day 

f 

0.34 

- 

Nt 

100.0 

virions 

cells 

A  E 

1.0  x  10~3 

cells 

mm3 -day 

0.1 

1 

day 

t>E 

0.3 

1 

day 

dE 

0.25 

1 

day 

Kb 

0.1 

cells 

mm3 

Kd 

0.5 

cells 

mm3 

Table  1:  The  values  of  the  parameters  in  the  HIV  model. 


denote  the  state  and  control  vectors,  respectively.  Thus  we  define 

/M 

T2 


x  = 


and 


u  = 


V i 

Vni 

\E  ) 

With  the  above  notation,  the  HIV  model  (1)  can  be  expressed  in  the  generic  form 

x  =  f(x)  +  B(x)u ,  (2) 

where  the  precise  representation  for  f(x)  and  B(x)  will  be  described  later  in  Section  4. 

For  feedback  control  we  need  current  knowledge  on  the  state  of  the  system.  In  our 
effort  here  we  assume  that  partial  state  observations  (Tf  +  T*,  Vj  +  Vni)  are  available. 
This  is  representative  of  the  type  of  clinical  data  widely  discussed  in  the  literature  (see  for 
example,  [1]).  Hence,  the  output  or  observation  takes  the  form 


z  = 


x  =  Cx,  (3) 

where  z i  and  z2  represent  the  total  CD4+  counts  and  the  total  viral  loads,  respectively. 


3  An  Optimal  Tracking  Problem  Formulation 


With  the  parameters  given  in  Table  1  and  the  assumption  of  no  medication  (ei  =  e-2  =  0), 
the  model  exhibits  several  steady  states.  These  are  described  and  analyzed  in  [1,  2]  with  the 
non  infectious  virus  Vni  state  always  being  zero  in  these  steady  states.  We  are  particularly 
interested  in  the  so-called  “healthy”  steady  state  given  by 
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cells 


V/  =  0.415 


mmJ 

virions 


mmJ 


T2  =  0.621 


Vni  =  0.0 


cells 

mm^ 

virions 


T*  =  0.076 


cells 


and 


rnrnJ 


E  =  353.108 


T*  =  0.006 
cells 


cells 


mmJ 


(4) 
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mmJ 
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which  was  also  shown  to  be  locally  asymptotically  stable.  This  means  that  after  a  suf¬ 
ficiently  small  perturbation  from  (4),  the  trajectory  of  the  state  x  returns  to  the  stable 
equilibrium  (4).  It  is  noted  that  this  stable  equilibrium  exhibits  a  strong  immune  response, 
low  viral  load  and  reasonably  large  target  cell  (Tj)  counts. 

Highly  active  antiretroviral  therapy  (HAART)  has  proven  to  be  very  effective  at  re¬ 
ducing  the  viral  load  to  below  detectable  levels.  However,  sustainable  suppression  has 
proven  to  be  difficult  using  HAART  because  patients  taking  these  drugs  experience  ad¬ 
verse  side  effects  that  make  adherence  to  therapy  very  difficult.  In  this  paper  we  adopt  a 
control  theoretic  approach  to  find  a  suboptimal  treatment  strategy  that  can  lead  to  high 
immune  effector  levels  and  subsequent  control  of  viral  load  without  the  need  for  further 
drug  therapy. 

We  formulate  the  problem  of  finding  an  effective  multidrug  therapy  as  a  tracking  prob¬ 
lem.  To  this  end,  we  define  the  objective  functional 

1  r°° 

J{x,  u)  =  -  {( V /  -  0.415)2  +  10 (E  -  353.108)2  +  +  (e2/e^)2}dt,  (5) 

4  JO 

where  Vj  is  the  number  of  free  virus  and  E  represents  the  immune  response.  The  control 
variable  e\,  where  0  <  e\  <  e™ax,  denotes  the  “efficacy”  of  the  reverse  transcriptase  in¬ 
hibitor.  Similarly,  the  control  variable  62,  0  <  62  <  e™3*,  represents  the  “efficacy”  of  the 
protease  inhibitor.  We  note  that  throughout  we  use  the  somewhat  nonstandard  terminol¬ 
ogy  “efficacy”  interchangeably  with  the  control  level  for  the  two  drugs.  The  weights  in  (5) 
have  been  determined  a  priori  through  a  series  of  numerical  experiments.  The  feedback 
control  algorithm  proposed  in  Section  5  is  not  particularly  sensitive  to  the  choice  of  these 
weights. 

We  denote  the  control  vector  u  by 


u  = 


We  also  introduce  the  following  two  vectors 


u  = 


and 


u  = 


The  optimal  tracking  control  problem  is  to  find  a  dynamic  multidrug  therapy  u(t)  satisfying 


min  J(x(t),u(t ))  (6) 

u<u(t)<u 

subject  to  the  state  equation  given  by  (2)  with  initial  condition  x(0)  =  xq.  We  could,  of 
course,  consider  the  restricted  class  of  scaler  controllers  u  where  ej(i)  =  ejit(i).  While  this 
is  the  more  usual  treatment  protocol  in  clinical  practice,  we  are  interested  in  investigation 
here  of  the  more  flexible  scenario  where  the  RTI  and  PI  levels  can  vary  independently. 
We  also  note  that  although  we  formulate  a  continuous  feedback,  current  clinical  practice 
involves  discrete  observations  and  hence  our  approach,  if  implemented,  would  have  to  be 
approximated  by  some  type  of  interpolated  estimate  of  the  observed  states  (see  especially 
the  computational  examples  below  where  impractical  observation  sampling  is  used). 

We  note  that  our  mathematical  model  for  HIV  dynamics  (2)  is  nonlinear.  One  of  the 
highly  promising  and  emerging  techniques  for  designing  nonlinear  feedback  controllers  is 
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the  state-dependent  Riccati  equation  (SDRE)  approach  in  the  context  of  the  nonlinear 
regulator  problem.  This  technique  essentially  uses  direct  parametrization  to  rewrite  the 
nonlinear  state  function  f(x)  in  (2)  as  a  product  of  a  state-dependent  coefficient  matrix 
with  the  state  vector.  This  parametrization  is,  of  course,  not  unique  and  one  obvious 
factorization  is  given  in  terms  of  the  local  linearization  of  the  state  equation. 


4  A  Local  Linearization  of  the  State  Equation 


To  design  a  nonlinear  feedback  controller  using  the  SDRE  methodology,  one  usually  first 
rewrites  the  nonlinear  dynamics  (2)  in  the  state-dependent  coefficient  form  f(x)  =  A(x)x. 
Underlying  this  form  is  the  tacit  assumption  that  /( 0)  =  0.  Our  HIV  model  (1)  does  not 
satisfy  this  condition.  For  this  reason  we  divide  f(x)  into  two  parts 

f(x)  —  a  and  a, 

where  a  is  chosen  in  such  a  way  that  /( 0)  —  a  =  0.  The  simplest  choice  is  a  =  /( 0). 
However,  instead  of  using  this  simple  choice  we  let  a  depend  on  x  and  we  denote  this 
dependence  by  a(x).  The  choice  of  a(x)  will  become  clear  below. 

We  thus  rewrite  the  state  equation  (2)  as 

x  =  A{x)x  +  a(x)  +  B(x)u.  (7) 

That  is,  f(x)  is  parameterized  as 

f(x)  =  A(x)x  +  a(x).  (8) 


It  has  been  noted  [7,  16,  23]  that  the  choice  of  A(x)  in  systems  where  /( 0)  =  0  is 
not  unique  and  the  same  also  holds  for  our  generalized  formulation.  However,  for  this 
formulation  a  natural  choice  of  the  matrix  A(x)  is  the  Jacobian  of  f(x).  That  is, 


A{x) 

_  df(x) 
dx 

and  a(x 

)  =  fix)  - 

A(x)x. 

For  our 

particular  HIV  model  (1) 

the  Jacobian 

is  given  by 

(~di  -  kxVj 

0 

0 

0 

-fciTi 

0 

0  \ 

0 

— d2  —  k^Vj 

0 

0 

~k2T2 

0 

0 

k\Vj 

0 

—5  —  m\E 

0 

fciTi 

0 

-miT* 

A{x)  = 

0 

k2Vj 

0 

—5  —  m2E 

k2T2 

0 

-m2T2* 

-pikiVi 

—fXikaVj 

Ne5 

Ne5 

455 

0 

0 

0 

0 

0 

0 

0 

— c 

0 

V  o 

0 

473 

474 

0 

0 

477  / 

where 

^55  = 

-c  -  p\kxT\  - 

P'2  k'2  T'2 , 

4-73  =  A7  4 


bEKbE 

(T*  +  T*  +  Kb )2 


dEKdE 


and 


A  77  — 


bE 


dE 


T*  +  T*  +  Kb  T*  +  T*  +  Kd 


(Ti*  +  T2*)  -  SE. 
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Finally,  the  matrix  B(x )  in  (7)  is  given  by 


B(x) 


(  h  ViTx 

fk2VIT2 
— WTi 
—  fk2VIT2 

p\k\VjT\  +  fp2k2ViT2 
0 

V  0 


0  \ 
0 

0 

0 

- NtS{T \*  +  T2*) 
Nt8(T j*  +  T9*) 

0  “ 


5  Optimal  Feedback  Controllers 


In  this  section  we  formally  derive  the  nonlinear  feedback  controllers  for  the  optimal  tracking 
problem  described  in  Section  3.  We  first  will  assume  that  all  state  variables  are  available 
for  feedback.  Our  approach  is  to  use  the  SDRE  technique  and  mimic  the  standard  linear 
quadratic  regulator  formulation  for  linear  systems. 

We  begin  by  rewriting  the  objective  functional  in  (5)  in  the  generic  form 


J(x,  u ) 


y)TQ(x 


\  rr 

y)  +  u  Ru 


dt, 


where  the  dynamic  tracking  variable  y  represents  the  stable  equilibrium  state  (4). 

The  Hamiltonian  for  our  optimal  control  problem  is  given  by 

H( x,  u,p )  =  ^  (x  -  y)T  Q  (x  -  y)  +  ^uTRu  +  pT  (A(x)x  +  a(x)  +  B(x)u) 

—  wT(u  —  u)  —  wT(u  —  u ), 

where  the  penalty  multiplier  vectors  w  and  w,  are  introduced  to  account  for  the  constraints 
on  the  control  variable  u,  are  non  negative  and  satisfy  the  conditions 


wT(u  —  u)  =  0 


at  the  optimal  control.  From  the  Hamiltonian,  the  necessary  conditions  for  optimality  are 
found  to  be 


dl~L 

x  =  A(x)x  +  a(x)  +  B(x)u  =  — — , 

op 


p  =  -Q(x  -  y)  - 


d  {A{x)x) 
dx 


P~ 

dl~i 

0  =  Ru  +  BT(x)p  —  w  +  w  =  — — . 

ou 


d(a(x)) 

dx 


P 


d  ( B{x)u ) 
dx 


p  = 


— (») 


dx 


Let  Aj-  denote  the  7th  row  of  A{x)  and  denote  the  7th  row  of  B(x).  Then 


d{A{x)x)  d{A{x)) 

=  A{x)  H - t - x 


dx 


=  A(x)  + 


dx 

'  dAl; 
dx  i 

dAn: 

.  dx  i 


X 


dAl ; 
dxn  ' 


dAn 

dxn 
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and 


d  ( B{x)u ) 
dx 


dx i  u  '  ' 

dB  I, 
dXn 

dBn: 

L  dxi  u  '  ' 

dBn-. 

dxn 

The  last  equation  of  (9)  leads  to  the  optimal  control 

u  =  —R-1  ( BT(x)p  —  w  +  id)  . 


(11) 


Mimicking  the  tracking  problem  for  linear  systems,  we  assume  that  the  adjoint  state  has 
the  form 

p  =  Il(x)x  +  b(x),  (12) 

where  II  is  a  matrix,  6  is  a  vector  and  both  are  state  dependent.  To  find  II  (x)  and  6(x ), 
we  differentiate  (12)  with  respect  to  time  along  a  trajectory  and  substitute  the  optimality 
condition  (9)  for  p  to  obtain 


-Q{x  -  y)  - 


d  (A(x)x) 
dx 


d(a(x)) 

dx 


P 


d  (B(x)u) 
dx 


p  =  tlx  +  ILx  +  6,  (13) 


where  we  use  the  notation 

n 

ri(‘T)  =  J2UXi(x)xi(t). 

i— 1 

A  similar  definition  also  holds  for  b.  Substituting  the  expressions  for  p.  x,  and  u  and 
rearranging  terms,  we  find 


ri(z)  + 


d{A(x)) 

dx 


lT 


X 


n(x)  + 


d  ( B(x)u ) 
dx 


lT 


n(x)  + 


d(a(x)) 

dx 


n(x) 


+  (n(x)A(x)  +  At(x)U{x)  -  U(x)B(x)R~1BT(x)U(x)  +  Q) 
+  b  —  Qy  +  II(x)a(x)  —  n(x)S(x)i?_1  (— w  +  w) 


+  A1  (x)  —  II  {x)B{x)R  1Bt(x)  + 


d{A{x)) 
dx 


-x 


+ 


d  (B(x)u) 
dx 


lT 


d(a(x)) 
dx 


lT\ 


6  =  0. 


(14) 


If  we  approximate  (see  [7])  by  assuming  that  the  derivative  terms 

d(A(x))  d(B(x)u)  d(a(x)) 
dx  ’  dx  ’  dx 

are  small  as  well  as  that  II  and  6  are  stationary,  it  follows  that  the  suboptimal  solution  to 
the  optimal  tracking  problem  is  given  by 


u  =  min  (max  (u,  u)  ,  u)  ,  (15) 

where  minimum  and  maximum  are  taken  component  wise  and 

u  =  —  R~1BT  (x)(Jl(x)x  +  b(x)).  (16) 
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It  should  be  emphasized  that  the  above  approximate  feedback  law  for  dynamic  multidrug 
treatment  is  only  suboptimal  since  the  formula  is  only  valid  if  both  controls  don’t  violate 
the  constraints  or  violate  the  constraint  simultaneously.  However,  our  numerical  studies 
suggest  that  the  feedback  law  works  reasonably  well  even  when  these  conditions  are  not 
met.  The  state  dependent  function  b(x)  is  given  by  the  expression 

b  =  ( At(x )  —  n (x)B(x)R~1B1  (x))  1  ( Qy  —  n(x)a(x)) , 

where  n(,x)  satisfies  the  state  dependent  Riccati  equation  (SDRE) 

At(x)U{x)  +  Yl{x)A(x)  -  U{x)B(x)R~1Bt{x)U{x)  +  Q  =  0. 

One  approach  involves  obtaining  the  solution  to  SDRE  is  by  use  of  symbolic  software 
packages  such  as  Macsyma  or  Mathematica.  However,  once  the  nonlinear  dynamics  of  the 
system  become  complex,  one  has  to  rely  on  numerical  approximation  techniques  to  obtain 
its  solution.  In  Sections  8  and  9,  we  will  discuss  the  numerical  procedure  that  we  used  to 
obtain  the  solution  of  the  SDRE. 

6  Compensator  Design 

For  the  synthesis  of  the  nonlinear  feedback  control  law  derived  in  Section  5,  full  knowledge 
of  all  the  state  variables  is  required.  However,  in  many  problems  of  practical  interest,  only 
partial  measurements  of  the  state  are  available.  In  this  section,  we  consider  the  problem  of 
designing  a  state  estimator  to  be  used  in  conjunction  with  the  nonlinear  feedback  control 
laws  described  earlier.  We  recall  that  the  dynamical  system  and  the  observation  are  given 
by 

x  =  f(x)  +  B(x)u 
z  =  Cx, 

where  the  output  matrix  C  is  defined  by  (3). 

As  in  the  linear  problem,  we  design  the  state  estimator  to  be  of  the  form 

Xe  =  fe(xe)  +  F(Cx,Xe), 

where  the  functions  fe  and  F  are  to  be  specified  later.  It  should  be  emphasized  that  the 
rationale  behind  the  state  estimator  is  that  it  is  indeed  the  state  estimator  xe  (and  not  the 
state  x)  that  is  to  be  used  in  the  nonlinear  feedback  control  laws  given  by  equations  (15)- 
(16).  Defining  the  error  function  between  the  state  and  the  state  estimator  as  e  =  x  —  xe 
and  taking  its  derivative,  we  obtain 

e  =  f(x)  +  B(x)u  -  fe(xe)  -  F(Cx,xe). 

If  we  choose  fe(xe)  =  f(xe)  +  B(xe)u  —  F(C xe,xe),  the  equation  for  the  derivative  of  the 
error  function  becomes 

e  =  (f(x)  +  B(x)u  -  F(Cx,  xe))  -  {f{xe)  +  B(xe)u  -  F(Cxe,  xe)) . 

We  next  parameterize  the  nonlinear  function  f(x)  as 

f(x)  =  A{x)x  +  a(x). 
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and  we  choose  F  to  be  the  product 

F(Cx,xe )  =  L(xe)Cx. 

The  derivative  of  the  error  then  becomes 

e  =  (A(x)x  +  a(x)  +  B(x)u  —  L(xe)Cx)  —  ( A(xe)xe  +  a(xe)  +  B(xe)u  —  L(xe)Cxe) 

=  (A(xe)  —  L(xe)C)  e  +  (B(x)  —  B(xe))u  +  A{x)x  —  A(xe)x  +  a(x)  —  a(xe ) 

=  (A(xe)  -  L(xe)C)  e  +  dB'dx^U^j  6  +  0  +  ~  ~  A(Xe)e  (17) 


A(xe)  +  ^2  ~  L(xe)c  \  e  +  O  (||e||2)  , 

\  i  e  / 


where  we  have  used  the  Taylor  series 


B(x)u  =  B(xe)u  +  ( ^2  — —Ui  |  e 


dxP 


O 


and 


f{x)  =  f(xe)  +  A{xe)e  +  O  (||e||2) 


The  subscript  -i  refers  to  the  ith  column  of  the  associated  matrix.  In  order  to  simplify  the 
notation,  we  define 


r  /  \  .  /  ,  x — \  dB.r  (xp ) 

A(xe)  =  A(xe)  +  ^2 - - ~ui- 


dxp 


(18) 


Since  the  goal  of  designing  a  state  estimator  is  that  it  approximates  the  true  state  of 
the  system  closely,  the  error  e  should  be  small.  Therefore,  from  equation  (17),  the  term 
^ A(xe )  —  L(xe)C ^  e  dominates  O  (||e||2).  For  this  reason  we  neglect  the  term  O  (||e||2) 

from  the  rest  of  our  discussion.  Now,  if  we  choose  L(xe )  so  that  the  eigenvalues  of  A(xe)  — 
L(xe)C  have  negative  real  parts,  the  estimation  error  e  will  converge  to  zero  asymptotically 
as  t  — >  oo  (hence,  xe  will  approach  x).  Since  A(xe)  —  L(xe)C  and  AT (xe)  —  CTLT(xe) 
have  the  same  eigenvalues,  we  can  design  the  compensator  gain  L(xe)  the  same  way  that 
we  design  the  feedback  gain  for  the  nonlinear  feedback  control  problem.  In  particular,  the 
compensator  gain  is  given  by 

Lr{xe)  =  N~1CT,(xe) 

or  equivalently 

L(xe)  =  S  {xe)CTN-\ 

where  T,(xe)  satisfies  the  state  estimator  dependent  Riccati  equation 

S(xe)iT(xe)  +  A(xe)Z(xe)  -  Yj(xe)CT N~1CF,(xe)  +  M  =  0.  (19) 


Here  M  is  a  symmetric  positive  semidefinite  matrix  and  N  is  a  symmetric  positive  definite 
matrix.  They  are  to  be  chosen  so  as  to  achieve  a  balance  between  desired  convergence 
properties  and  compensator  gain. 
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7  State  Equations  and  State  Estimator 

The  system  of  differential  equations  for  the  state  x  and  the  state  estimator  xe  are  given  by 


x  =  f(x)  +  B(x)u  +  ws(t) 

xe  =  f(xe)  +  B(xe)u  +  L(xe)(Cx  +  wm(t )  -  Cxe). 


(20) 


The  equations  for  x  and  xe  are  indeed  coupled  due  to  the  term  B(x)u  in  the  first  equation 
of  (20)  and  the  term  L(xe)Cx  in  the  second  equation  of  (20).  Therefore,  it  is  necessary  to 
solve  both  equations  together.  In  the  following  we  use  the  notation 

x=(D  and  G(x)  =  (  ,  ,  /W  +  ^W“ +  »,(()  \ 

\Xe)  \f(xe)  +  B(xe)u  +  L(xe)(Cx  +  Wm(t)  -  Cxe ) ) 


With  this  notation,  the  system  of  differential  equations  (20)  can  be  expressed  as 


x  =  G(x). 


8  Discretization  Method 

We  perform  the  time  discretization  using  backward  differentiation  formulas  (BDF)  [15] 
with  a  uniform  time  step  At.  The  solution  at  time  kAt  is  denoted  by  xA:.  Then  the 
discrete  form  of  (20)  using  BDF  formulas  is  given  by 

q-l 

xfc+1  =  «AtG(xfc+1)  +  ^&xfc-*, 

i=0 

where  q  is  the  degree  of  the  BDF  and  a  and  are  constants  that  depend  on  q.  The 
first-order  BDF  is  the  implicit  Euler  method  defined  by: 

<7=1,  a  =  1  and  fi\  =  1. 

The  second-order  BDF  is  denoted  by  BDF2  and  is  given  by: 

2  4  1 

q  =  2,  Pi  =  3  and  p2  =  --- 

With  the  BDF2  method  it  is  necessary  to  perform  the  first  step  of  integration  using  some 
other  numerical  method,  since  it  requires  the  solutions  at  two  previous  time  steps.  A 
common  choice  is  to  use  the  implicit  Euler  method  for  the  first  time  step.  It  can  be  shown 
that  this  does  not  reduce  the  order  of  accuracy.  It  is  well  known  that  the  implicit  Euler 
method  and  BDF2  have  good  stability  properties  [15]. 

Finally,  we  note  that,  at  each  time  step,  the  implicit  Euler  method  and  the  BDF2 
method  require  the  solution  of  system  of  nonlinear  equations 

q-l 

xA:+1  -  aAfG(xfc+1)  =  J2  Axfe“* 

i= 0 

for  xfc+1  in  terms  of  solutions  at  previous  time  steps,  xA:,  x^1,  etc. 
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9  Solution  to  the  Discrete  Nonlinear  Equations 

The  implicit  time  discretization  requires  us  to  solve  nonlinear  equations  of  the  form 

p- 1 

g(x)  =  x  —  aAfG(x)  —  ^/3,;xfc_*  =  0.  (21) 

i= o 

In  the  calculations  reported  on  here,  Newton’s  iterative  method  [19]  was  used  to  obtain 
the  solution  of  (21).  Let  the  ?'th  iterate  be  denoted  by  x^).  Then  Newton’s  method  solves 
for  the  (i  +  l)th  iterate  in  terms  of  the  ith  iterate  by 

Xb+D  _  XM  —  (I  —  aAtJ)_1g(x^), 
where  J  is  the  Jacobian  matrix  of  G. 

In  the  above  formula,  we  need  to  evaluate  G  at  x.  That  is,  we  must  calculate 

Q(x)  —  f  A(x)x  +  a(x)  +  B(x)u  +  ws(t)  \ 

X  \A(xe)xe  +  a(xe)  +  B(xe)u  +  L(xe)(Cx  +  wm(t)  -  Cxe ) J  ’ 

where  the  control  is  given  by 

u  =  min  (max  (u,  u)  ,  u)  , 

with 

u  =  - R~1BT(xe )  (u(xe)xe  +  6(xe))  . 

In  order  to  do  this  we  need  to  solve  the  state  dependent  Riccati  equation 

n(xe)A(xe)  +  AT(xe)Ii(xe)  -  U(xe)S{xe)U{xe)  +  Q  =  0, 

where  we  have  used  the  notation  S(xe)  =  B(xe)R~1BT(xe).  In  addition,  the  vector  func¬ 
tion  b(xe)  given  by 

b(xe)  =  ( AT(xe )  -  n(xc)5(xc))  1  (Qy  -  U(xe)a(xe)) . 

also  must  be  computed.  Finally,  to  evaluate  the  compensator  gain  L(xe),  which  is  given 
by 

L(xe)  =  V{xe)CTN-\ 
the  dual  state  dependent  Riccati  equation 

Z(xe)AT(xe)  +  i(xe)E(xc)  -  S(xe)C,TA”1C'S(xe)  +  M  =  0, 

with  A(xe)  given  by  (18)  also  must  be  solved. 

To  obtain  the  Jacobian  J  of  G,  one  can  derive  analytical  formulas  for  J  which  are  rather 
cumbersome.  In  practice,  it  is  more  convenient  to  use  a  finite  difference  approximation 
of  the  Jacobian  J.  For  our  application,  the  forward  difference  approximation  that  we 
employed  is  sufficiently  accurate  and  of  low  computational  cost. 

10  Simulation  Results 

The  time  period  for  simulations  is  500  days.  Unless  otherwise  stated  we  have  used  the  time 
step  At  =  1/96,  which  corresponds  to  fifteen  minutes.  The  discretization  was  performed 
with  the  second-order  backward  differentiation  formula  using  the  implicit  Euler  method  to 
compute  the  first  time  step. 
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10.1  Full  State  Feedback  Control 


We  first  present  simulation  results  where  we  assumed  that  full  state  observations  are  avail¬ 
able  for  feedback. 


10.1.1  Optimal  treatment  in  the  acute  state 

In  our  first  example,  we  assume  that  medication  starts  immediately  after  HIV  infection  (an 
unrealistic  assumption  in  practice-see  further  comments  below).  In  this  case,  the  initial 
values  for  the  state  variables  are  given  by  the  acute  state 


Ti  =  1000- 


V/  =  0.001- 


cells 

1 - 3’ 

mm-3 

virions 


T2  =  3.198- 


cells 


T,*  =  0.0- 


cells 


mm0 


Vni  =  0.0- 


mni'J 

virions 


mmJ 


n  =  0.0- 


cells 


mm° 


and 


E  =  0.01- 


cells 


(22) 


mmd 


mrrrJ 


In  this  example  we  also  assume  that  the  maximum  “efficacies”  to  be  e™ax  =  0.7  and 

emax  =  0  3 

Figure  1  depicts  the  dynamics  of  the  suboptimal  reverse  transcriptase  inhibitor  (RTI) 
control  ei  and  the  protease  inhibitor  (PI)  control  £2-  The  corresponding  state  progressions 
are  shown  in  Figure  2.  After  approximately  325  days  the  state  variables  have  reached  near 
equilibrium  and  start  to  oscillate  in  the  neighborhood  of  the  stable  “healthy”  steady  state 
(4)' 

After  325  days  the  feedback  control-based  treatment  calls  for  many  cycles  of  on  and 
off  therapy,  each  with  a  short  duration  of  few  hours.  This  type  of  protocol  is  difficult 
for  the  patients  to  follow  since,  for  example,  many  pills  need  to  be  taken  together  with 
food.  Therefore,  to  avoid  this  type  of  treatment  schedule  in  the  long  term  we  propose  to 
terminate  the  medication  when  the  viral  load  has  reached  a  sufficiently  low  level.  Thus  the 
medication  will  be  administered  only  when  the  condition 


Vj  +  Vni  >1-0 


virions 

mm3 


(23) 


is  satisfied.  Figure  3  presents  the  resulting  treatment  regimen  and  Figure  4  depicts  the 
corresponding  state  variables.  In  this  case,  even  though  it  does  take  longer  for  the  state  to 
settle  down  to  the  “healthy”  equilibrium  state,  the  results  confirm  that  once  the  system 
reaches  the  stable  equilibrium  state,  medication  is  no  longer  required.  We  remark  that 
because  of  the  condition  (23)  we  also  do  not  give  any  medication  during  the  first  two  days 
after  the  infection,  since  during  this  time  the  viral  load  is  below  our  limit.  In  practice,  the 
first  treatment  would  be  even  further  delayed  since  one  would  almost  never  be  aware  of 
the  infection  early  on  or  even  when  the  lower  load  level  in  (23)  is  first  exceeded. 

So  far  in  our  numerical  studies  discussed  above,  we  used  the  maximum  efficacies  e™ax  = 
0.7  and  e™ax  =  0.3.  An  important  and  interesting  question  is  how  low  these  efficacies  can 
be  in  order  for  state  of  the  system  to  be  able  to  reach  the  “healthy”  equilibrium  state. 
In  our  previous  work  [2],  the  analysis  revealed  that  without  any  medication  the  state  will 
converge  to  the  unhealthy  steady  state  (24)  given  below.  Thus,  there  should  be  a  threshold 
boundary  in  the  (e™ax,  e™8^)  efficacy  plane  which  indicates  whether  it  is  possible  or  not  to 
reach  the  “healthy”  equilibrium  state  with  any  admissible  drug  therapy. 

We  investigated  the  influence  of  the  maximum  efficacies  by  varying  both  of  them  from 
zero  to  one  and  carrying  out  a  series  of  numerical  simulations  with  the  nonlinear  feedback 
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Figure  1:  Multidrug  therapy  starting  from  the  acute  state. 


controls  for  each  of  these  combinations  of  bounds.  Figure  5  depicts  the  immune  effector  E 
after  500  days  for  these  simulations.  It  is  noted  that  in  the  white  region  of  Figure  5,  the 
immune  effector  is  around  350-^1.  Hence,  for  the  values  of  the  maximum  efficacies  in  this 
region  the  nonlinear  suboptimal  feedback  control  will  bring  the  state  of  the  system  to  the 
“healthy”  equilibrium  state. 

Finally,  from  Figure  5  we  can  observe  a  slight  apparent  lack  of  robustness  of  the  pro¬ 
posed  nonlinear  feedback  control  law.  This  is  probably  a  consequence  of  the  local  lineariza¬ 
tion  of  the  nonlinear  dynamics  or  the  suboptimal  nature  of  the  approximate  nonlinear 
controller.  For  example,  when  the  maximum  efficacies  are  e“ax  =  0.8  and  e™ax  =  0.0,  the 
“healthy”  equilibrium  state  is  reachable.  However,  when  the  maximum  efficacy  e™ax  is 
increased  by  0.025,  the  “healthy”  equilibrium  state  is  no  longer  reachable. 


10.1.2  Optimal  treatment  in  the  unhealthy  steady  state 

In  our  second  set  of  examples,  we  consider  a  patient  who  has  not  taken  medication  after 
HIV  infection  and  the  disease  progression  is  proceeding  towards  the  unhealthy  steady  state 
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(24) 


Again  the  question  is  whether  our  suboptimal  approximate  nonlinear  feedback  controller 
can  transfer  the  system  from  the  unhealthy  state  (24)  to  the  “healthy”  equilibrium  state  (4) 
in  finite  time.  We  begin  by  letting  the  maximum  efficacies  be  e™ax  =  0.7  and  e 2iax  =  0.3. 
The  dynamics  of  the  suboptimal  treatment  strategy  proposed  by  the  nonlinear  feedback 
control  are  shown  in  Figure  6  and  the  corresponding  state  variables  are  plotted  in  Figure 
7.  Clearly,  in  this  case  the  “healthy”  equilibrium  state  is  not  reachable. 
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Figure  2:  Plots  of  the  state  variables  with  the  y-axis  in  log10-scale. 

Next  we  increase  the  maximum  efficacy  e™3*  of  the  reverse  transcriptase  inhibitor  to 
be  0.75.  The  resulting  optimal  therapy  regimen  is  depicted  in  Figure  8.  The  therapy  does 
bring  the  system  to  the  “healthy”  equilibrium  state  as  can  be  seen  in  Figure  9.  It  is  noted 
that  we  discontinued  the  medication  using  the  viral  load  condition  (23)  as  proposed  earlier. 

As  in  the  earlier  example  for  acute  infection,  we  investigated  the  combination  of  the 
maximum  efficacies  e“iax  and  e™ax  so  that  “healthy”  equilibrium  state  is  reachable.  Figure 
10  depicts  the  immune  effector  E  after  500  days  as  a  function  of  maximum  efficacies.  We 
also  plotted  in  Figure  11  contour  lines  of  the  so-called  “total  combined  drug  efficacy”  (some 
measure  of  the  combined  control  level) 

e  =  1  -  (1  -  ei)(l  -  e2) 

as  defined  in  [12].  The  threshold  boundary  in  Figure  10  approximates  very  well  the  80% 
total  efficacy  curve  given  by  the  equation  e  =  0.8. 

Figure  10  demonstrates  that  slightly  smaller  maximum  efficacies  are  required  than  for 
the  acute  state  in  Figure  5.  This  again  suggests  a  slight  lack  of  robustness  in  the  nonlinear 
feedback  control  for  the  acute  state.  It  is  most  likely  that  the  approximations  are  not 
sufficiently  accurate  for  the  highly  nonlinear  transient  period  observed  after  the  infection 
in  the  acute  state  case. 
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Figure  3:  The  medication  levels  starting  from  the  acute  state  using  the  stopping  criterion 
based  on  the  viral  load. 

10.2  Feedback  Control  and  State  Estimation 

In  this  section,  we  consider  the  design  and  synthesis  of  nonlinear  feedback  control-based 
treatments  in  which  one  only  has  partial  state  observations. 

10.2.1  Inaccurate  initial  condition  for  the  state  estimator 

We  consider  first  the  acute  infection  case  where  the  initial  value  for  the  state  x  is  given  by 
(22).  We  disturbed  the  initial  value  of  the  state  estimator  xe  by  20%  so  that  xe  =  0.8x. 
As  in  our  earlier  computational  experiment,  the  maximum  efficacies  are  e™ax  =  0.7  and 
e““  =  0.3.  For  the  computation  of  the  estimator,  the  weighting  matrices  in  (19)  are  chosen 
to  be 

M  =  0.01/2x2  and  M  =  Ijx7  > 

where  In  denotes  the  l  x  l  identity  matrix.  During  the  100  first  days  in  the  simulation  the 
estimator  xe  has  tendency  to  have  negative  components  which  is  not  feasible.  In  such  a 
case,  we  project  the  negative  components  to  be  zero. 

Figure  12  records  the  suboptinral  treatment  protocol  obtained  using  the  estimator  xe 
in  the  nonlinear  feedback  control  laws.  The  therapy  dynamics  are  fairly  similar  to  the 
ones  shown  in  Figure  3.  Figure  13  depicts  the  Euclidean  distance  between  the  state  and 
the  estimator.  This  plot  clearly  indicates  that  the  state  estimator  converges  to  the  state 
asymptotically.  In  Figure  14  we  plot  the  state  variables  and  their  estimators  for  the  first 
50  days.  After  50  days  it  is  difficult  to  discern  the  differences  between  the  state  and  the 
estimator  in  such  plots  and  they  are  omitted. 

10.2.2  Noisy  measurements 

We  next  consider  the  suboptinral  feedback  controller  in  the  presence  of  noise  in  the  obser¬ 
vations.  At  the  kth  time  step,  we  denote  the  measurement  without  noise  and  the  noise 
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Figure  4:  Plots  of  the  state  variables  using  the  stopping  criterion  based  on  the  viral  load 
with  the  y-axis  in  log10-scale. 


by  Cxk  and  wkn.  respectively.  We  assume  the  noise  to  be  lognormally  distributed  without 
correlation  between  the  components.  Furthermore,  we  assume  the  noise  has  zero  mean  and 
the  variance  of  the  zth  component  is  taken  to  be  af(Cxk)i.  The  assays  measuring  CD4 
cell  counts  and  viral  loads  typically  have  this  type  of  measurement  errors  [27,  32].  Under 
these  assumptions  the  ith  component  of  the  noise  is  given  by 

=  exp  (V  ^log  {Cxk)i  -  X-  log  (of  +  1),  log(of  +  1)^  -  (' Cxk)u 

where  A /"(•,•)  is  the  normal  distribution  with  the  mean  given  by  the  first  argument  and 
the  variance  given  by  the  second  argument.  In  the  following  experiment  the  parameter  of 
is  chosen  as  of  =  (0.10)2.  The  measurements  without  the  noise  and  with  the  noise  are 
graphed  in  Figure  15  where  the  observation  sampling  time  is  two  hours. 

In  order  to  make  the  estimator  xe  sufficient  for  reasonable  control,  it  appears  to  be 
necessary  to  reduce  the  sampling  time  increments.  For  the  next  results,  we  used  fifteen 
second  sampling  time  steps,  that  is,  At  =  1/5760.  We  used  the  same  initial  values  as  in 
the  case  without  noise,  that  is,  the  state  corresponds  to  the  acute  infection  and  the  state 
estimator  initial  value  is  20%  off. 

Figure  16  depicts  the  optimal  treatment  strategy  with  imperfect  measurements.  Be¬ 
cause  of  the  noise  the  controls  contain  some  oscillations  from  250  days  to  320  days.  The 
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Figure  5:  The  immune  effector  E  after  500  days  after  HIV  infection  for  different  maximum 
efficacies  e?13"*  and  eVax.  White  and  black  colors  corresponds  to  values  around  370-^%  and 

i  z  1  mm*3 

0.02 -^f,  respectively. 

mm*3  ’  1  J 


short  interruptions  in  the  medications  during  the  143th  day  and  the  305th  day  are  probably 
due  to  lack  of  robustness  in  our  Newton  solver  and  are  thus  an  artifact  of  our  computational 
schemes.  In  Figure  17  we  plot  the  state  variables  and  their  estimators  for  the  entire  500 
days  and  the  Euclidean  distance  between  them.  Note  that  the  estimators  approximate  very 
well  the  true  states  of  the  system  except  at  the  beginning  and  the  end  of  the  simulations. 
Despite  these  discrepancies,  the  suboptimal  dynamic  multidrug  therapy  is  still  very  similar 
to  the  one  without  noise  in  observations  depicted  in  Figure  12. 
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Figure  10:  The  immune  effector  E  after  500  days  after  starting  medication  on  the  unhealthy 
state  for  different  maximum  efficacies  e°iax  and  White  and  black  colors  corresponds 

to  values  around  370-^^!  and  0.02-^|,  respectively. 
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Figure  11:  Contour  lines  of  the  “total  combined  drug  efficacy”  e  =  1  —  (1  —  ei)(l  —  £2). 
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Figure  12:  The  suboptimal  dynamics  treatment  based  on  the  state  estimator. 


Figure  13:  The  Euclidean  distance  between  the  state  and  state  estimator  with  the  y-axis 
in  log10-scale. 


Figure  14:  Plots  of  the  state 
with  the  y-axis  in  log10-scale. 


50  days 


11  Conclusions 


We  presented  and  used  techniques  and  ideas  from  control  theory  to  design  and  synthesize 
nonlinear  feedback  control-based  treatment  regimes  for  HIV.  The  mathematical  model  for 
HIV  progression  includes  compartments  for  target  cells,  infected  cells,  virus,  and  immune 
response  that  are  subjected  to  multiple  (RTI-  and  PI-  type)  drug  treatments  as  controllers. 
We  have  demonstrated  through  numerical  simulations  that  by  using  a  “target  tracking” 
approach,  suboptimal  feedback-based  treatment  strategies  can  be  designed  to  move  the 
state  of  the  system  from  an  “unhealthy”  state  (high  virus  load  and  low  immune  response) 
to  a  “healthy”  one  (with  low  viral  load  and  high  immune  effector  levels).  An  important 
advantage  of  this  drug  regimen  design  is  that  once  the  viral  load  is  controlled  to  very  low 
levels,  the  drug  dosage  can  be  reduced  or  completely  terminated.  Consequently,  long  term 
pharmaceutical  side  effects  could  also  be  reduced.  Thus,  this  approach  suggests  that  by 
anticipating  and  responding  to  the  disease  progression,  dynamic  feedback  strategies  such 
as  those  designed  in  this  paper  could  lead  to  long-term  control  of  HIV  after  discontinuation 
of  therapy. 
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